Section: New Results
De-novo calling alternative splicing events from RNA-seq data
We addressed the problem of identifying and quantifying polymorphisms in RNA-seq data when no reference genome is available, without assembling the full transcripts. Based on the fundamental idea that each polymorphism corresponds to a recognisable pattern in a de Bruijn graph constructed from the RNA-seq reads, we proposed a general model for all polymorphisms in such graphs. We then introduced an exact algorithm, called KisSplice, to extract alternative splicing events. The first version of KisSplice appeared in 2011, and several important improvements were implemented in 2012 [24]. The first concerns memory consumption: the new version is much more memory efficient and can handle datasets of approximately reads. The second concerns running time: the enumeration step can now be done in parallel, which significantly reduces the overall running time. Finally, an improved event quantification step was added to the method.
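The idea that a polymorphism shows up as a recognisable pattern in a de Bruijn graph can be illustrated with a minimal Python sketch. This is not the KisSplice algorithm: it is a naive toy enumeration, assuming error-free reads on a single strand (no reverse complements, no quantification), and the names `build_de_bruijn`, `find_bubbles` and `max_len` are hypothetical. It builds the graph of k-mers and reports "bubbles", that is, pairs of internally disjoint paths that leave the same branching k-mer and meet again at the same k-mer, as two isoforms differing by one included segment would.

```python
from itertools import combinations


def build_de_bruijn(reads, k):
    """Return (kmers, succ): the set of k-mers and a successor map.

    An edge u -> v exists when v is present and overlaps u by k-1 bases.
    """
    kmers = {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}
    succ = {}
    for km in kmers:
        succ[km] = {km[1:] + b for b in "ACGT" if km[1:] + b in kmers}
    return kmers, succ


def find_bubbles(kmers, succ, max_len):
    """Naively enumerate bubbles whose paths have at most max_len edges."""
    # Count predecessors to recognise nodes where paths can merge.
    pred_count = {km: 0 for km in kmers}
    for u in kmers:
        for v in succ[u]:
            pred_count[v] += 1

    bubbles = []
    for src in sorted(kmers):
        if len(succ[src]) < 2:
            continue  # only branching nodes can open a bubble
        paths_by_end = {}
        stack = [(src,)]
        while stack:  # depth-first enumeration of simple paths from src
            path = stack.pop()
            last = path[-1]
            if len(path) > 1 and pred_count[last] >= 2:
                paths_by_end.setdefault(last, []).append(path)
            if len(path) - 1 < max_len:
                for nxt in sorted(succ[last]):
                    if nxt not in path:
                        stack.append(path + (nxt,))
        for paths in paths_by_end.values():
            for p, q in combinations(paths, 2):
                # Keep pairs that share only their first and last node.
                if set(p[1:-1]).isdisjoint(q[1:-1]):
                    bubbles.append((p, q))
    return bubbles


if __name__ == "__main__":
    # Two toy isoforms: the second one skips the segment "TGCA".
    reads = ["AACCGGTTGCAGATTACA", "AACCGGTGATTACA"]
    kmers, succ = build_de_bruijn(reads, k=5)
    for p, q in find_bubbles(kmers, succ, max_len=12):
        print(p[0], "->", p[-1], "path lengths:", len(p), len(q))
```

On these two toy sequences the sketch reports a single bubble that opens at the k-mer CCGGT and closes at GATTA, with one path longer than the other. The exhaustive path enumeration is exponential in the worst case and is only meant to make the pattern concrete.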
In terms of applications, we showed that KisSplice identifies more correct events than general-purpose transcriptome assemblers. Additionally, on a dataset of 71 million reads from human brain and liver tissues, KisSplice identified 3497 alternative splicing events, of which 56% are not present in the annotations. This confirms recent estimates showing that the complexity of alternative splicing has been largely underestimated so far.